- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.46)
Calibrated Multi-Level Quantile Forecasting
Ding, Tiffany, Gibbs, Isaac, Tibshirani, Ryan J.
We present an online method for guaranteeing calibration of quantile forecasts at multiple quantile levels simultaneously. A sequence of $α$-level quantile forecasts is calibrated if the forecasts are larger than the target value at an $α$-fraction of time steps. We introduce a lightweight method called Multi-Level Quantile Tracker (MultiQT) that wraps around any existing point or quantile forecaster to produce corrected forecasts guaranteed to achieve calibration, even against adversarial distribution shifts, while ensuring that the forecasts are ordered -- e.g., the 0.5-level quantile forecast is never larger than the 0.6-level forecast. Furthermore, the method comes with a no-regret guarantee that implies it will not worsen the performance of an existing forecaster, asymptotically, with respect to the quantile loss. In experiments, we find that MultiQT significantly improves the calibration of real forecasters in epidemic and energy forecasting problems.
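The core mechanism behind such trackers can be sketched as online gradient descent on the pinball loss at several levels at once, with a sort after each update to keep the forecasts ordered. This is an illustration only, not the authors' MultiQT (function and parameter names here are mine):

```python
import numpy as np

def track_quantiles(y_seq, alphas, eta=0.1):
    """Generic online quantile tracker: gradient descent on the pinball
    loss at several quantile levels, sorting after each update so the
    forecasts never cross. Illustrative sketch, not the authors' MultiQT."""
    alphas = np.asarray(alphas, dtype=float)
    q = np.zeros_like(alphas)             # current quantile forecasts
    issued = []
    for y in y_seq:
        issued.append(q.copy())           # forecast made before seeing y
        q = q + eta * (alphas - (y < q))  # raise q where it under-covered
        q = np.sort(q)                    # enforce non-crossing forecasts
    return np.array(issued)

rng = np.random.default_rng(0)
y = rng.normal(size=5000)
forecasts = track_quantiles(y, alphas=[0.5, 0.9])
coverage = (y[:, None] <= forecasts).mean(axis=0)  # should approach [0.5, 0.9]
```

The step size `eta` controls how fast the tracker reacts to distribution shift; the long-run fraction of covered time steps matches each target level regardless of the data sequence.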
- Asia > Middle East > Jordan (0.04)
- North America > United States > Vermont (0.04)
- North America > United States > Texas (0.04)
- Health & Medicine (1.00)
- Energy > Power Industry (1.00)
- Energy > Renewable > Solar (0.94)
Prediction Markets with Intermittent Contributions
Vitali, Michael, Pinson, Pierre
Although both data availability and the demand for accurate forecasts are increasing, collaboration between stakeholders is often constrained by data ownership and competitive interests. In contrast to recent proposals within cooperative game-theoretic frameworks, we place ourselves in a more general framework based on prediction markets, in which independent agents trade forecasts of uncertain future events in exchange for rewards. We introduce and analyse a prediction market that (i) accounts for the historical performance of the agents, (ii) adapts to time-varying conditions, and (iii) permits agents to enter and exit the market at will. The proposed design employs robust regression models to learn the optimal combination of forecasts whilst handling missing submissions. Moreover, we introduce a pay-off allocation mechanism that considers both in-sample and out-of-sample performance while satisfying several desirable economic properties. Case studies on simulated and real-world data demonstrate the effectiveness and adaptability of the proposed market design.
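A toy combination rule in this spirit, weighting whichever agents are present by their discounted past accuracy, might look as follows. This is an illustration with invented names, not the paper's robust-regression market design:

```python
import numpy as np

def market_combine(submissions, outcomes, halflife=10.0):
    """Toy forecast-combination rule (illustrative only): each round,
    agents that submitted are weighted by the inverse of their
    exponentially discounted past absolute error. The scheme tracks
    historical skill, adapts over time, and tolerates agents entering
    and exiting (missing submissions are NaN). Assumes at least one
    agent submits each round."""
    lam = 0.5 ** (1.0 / halflife)        # per-round discount factor
    n_rounds, n_agents = submissions.shape
    err = np.ones(n_agents)              # discounted error tally (prior = 1)
    norm = np.ones(n_agents)             # matching discount normaliser
    combined = np.empty(n_rounds)
    for t in range(n_rounds):
        f = submissions[t]
        present = ~np.isnan(f)
        skill = norm[present] / err[present]   # inverse mean past error
        w = skill / skill.sum()
        combined[t] = w @ f[present]
        # update records only for agents that submitted this round
        err[present] = lam * err[present] + np.abs(f[present] - outcomes[t])
        norm[present] = lam * norm[present] + 1.0
    return combined
```

Because an absent agent's record is simply frozen, re-entering the market after a pause costs nothing beyond the skill already demonstrated, which is the intermittent-contribution behaviour the paper targets.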
- North America > United States (0.94)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Denmark (0.04)
- Europe > Belgium (0.04)
- Research Report (0.82)
- Overview (0.68)
- Banking & Finance > Trading (1.00)
- Government > Regional Government > North America Government > United States Government (0.94)
- Energy > Renewable > Wind (0.69)
Federated Learning of Quantile Inference under Local Differential Privacy
Cai, Leheng, Hu, Qirui, Wu, Shuyuan
In this paper, we investigate federated learning for quantile inference under local differential privacy (LDP). We propose an estimator based on local stochastic gradient descent (SGD), whose local gradients are perturbed via a randomized mechanism with global parameters, making the procedure tolerant of communication and storage constraints without compromising statistical efficiency. Although the quantile loss and its corresponding gradient do not satisfy the standard smoothness conditions typically assumed in the existing literature, we establish asymptotic normality for our estimator as well as a functional central limit theorem. The proposed method accommodates data heterogeneity and allows each server to operate with an individual privacy budget. Furthermore, we construct confidence intervals for the target value through a self-normalization approach, thereby circumventing the need to estimate additional nuisance parameters. Extensive numerical experiments and a real-data application validate the theoretical guarantees of the proposed methodology.
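The basic privatized-gradient idea can be sketched for a single server using randomized response. This is an illustration only, not the paper's full federated estimator or its self-normalized inference; names and parameter choices are mine:

```python
import numpy as np

def ldp_quantile_sgd(data, tau, eps, lr0=1.0, seed=0):
    """Sketch of SGD for the tau-quantile where each step releases only
    a randomized-response version of the bit 1{y <= theta}, which
    satisfies eps-LDP. The bit is debiased before the gradient step and
    a running (Polyak) average of the iterates is returned."""
    rng = np.random.default_rng(seed)
    p = np.exp(eps) / (1.0 + np.exp(eps))  # prob. of reporting the true bit
    theta, theta_bar = 0.0, 0.0
    for t, y in enumerate(data, start=1):
        b = float(y <= theta)
        b_priv = b if rng.random() < p else 1.0 - b      # randomized response
        b_hat = (b_priv - (1.0 - p)) / (2.0 * p - 1.0)   # unbiased debiasing
        theta -= (lr0 / np.sqrt(t)) * (b_hat - tau)      # pinball-loss step
        theta_bar += (theta - theta_bar) / t             # running average
    return theta_bar
```

Only the randomized bit ever leaves the data holder, so the privacy cost per observation is exactly `eps`; the debiasing keeps the gradient unbiased, at the price of extra variance that shrinks `eps` inflates.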
- North America > United States (0.28)
- Asia > Middle East > Jordan (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.46)
Smoothed SGD for quantiles: Bahadur representation and Gaussian approximation
Chen, Likai, Keilbar, Georg, Wu, Wei Biao
This paper considers the estimation of quantiles via a smoothed version of the stochastic gradient descent (SGD) algorithm. By smoothing the score function in the conventional SGD quantile algorithm, we achieve monotonicity in the quantile level, in the sense that the estimated quantile curves do not cross. We derive non-asymptotic tail probability bounds for the smoothed SGD quantile estimate both with and without Polyak-Ruppert averaging. For the latter, we also provide a uniform Bahadur representation and a resulting Gaussian approximation result. Numerical studies show good finite-sample behavior consistent with our theoretical results.
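The smoothing idea, replacing the hard indicator in the SGD score with a kernel CDF, can be sketched as follows. The logistic kernel and all parameter choices here are mine, for illustration; the paper's exact algorithm and conditions differ:

```python
import numpy as np

def smoothed_sgd_quantiles(data, taus, h=0.5, lr0=1.0):
    """SGD for several quantile levels where the indicator 1{y <= theta}
    in the score is replaced by a smooth logistic CDF sigma((theta-y)/h).
    With a small enough step relative to h the update map is monotone in
    theta and in tau, so the estimated quantile curves cannot cross.
    Returns the Polyak-Ruppert averaged iterates."""
    taus = np.asarray(taus, dtype=float)
    theta = np.zeros_like(taus)
    theta_bar = np.zeros_like(taus)
    for t, y in enumerate(data, start=1):
        smooth_ind = 1.0 / (1.0 + np.exp(-(theta - y) / h))  # ~ 1{y <= theta}
        theta = theta - (lr0 / np.sqrt(t)) * (smooth_ind - taus)
        theta_bar += (theta - theta_bar) / t
    return theta_bar
```

Note that the bandwidth `h` trades bias for smoothness: the averaged iterates target a slightly smoothed version of the true quantiles, which is the price of the non-crossing property.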
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Berlin (0.04)
On Quantile Regression Forests for Modelling Mixed-Frequency and Longitudinal Data
The aim of this thesis is to extend the applications of the Quantile Regression Forest (QRF) algorithm to handle mixed-frequency and longitudinal data. To this end, standard statistical approaches have been exploited to build two novel algorithms: the Mixed-Frequency Quantile Regression Forest (MIDAS-QRF) and the Finite Mixture Quantile Regression Forest (FM-QRF). The MIDAS-QRF combines the flexibility of QRF with the Mixed Data Sampling (MIDAS) approach, enabling non-parametric quantile estimation with variables observed at different frequencies. FM-QRF, on the other hand, extends random effects machine learning algorithms to a QR framework, allowing for conditional quantile estimation in a longitudinal data setting. This dissertation contributes both methodologically and empirically. Methodologically, the MIDAS-QRF and the FM-QRF represent two novel approaches for handling mixed-frequency and longitudinal data in a QR machine-learning framework. Empirically, the application of the proposed models to financial risk management and climate-change impact evaluation demonstrates that they are accurate and flexible models for complex empirical settings.
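The QRF building block that both algorithms extend fits in a few lines: a tree predicts a conditional quantile from the empirical distribution of the training responses in a leaf, rather than from their mean. A sketch with precomputed leaf assignments (the function name is mine):

```python
import numpy as np

def leaf_quantiles(leaf_ids, y, taus):
    """Core QRF idea (Meinshausen, 2006) that the thesis builds on: the
    prediction at a point is a quantile of the training responses that
    fall in the same leaf, not their mean. Leaf assignments are assumed
    to come from an already-fitted tree."""
    y = np.asarray(y, dtype=float)
    leaf_ids = np.asarray(leaf_ids)
    return {leaf: np.quantile(y[leaf_ids == leaf], taus)
            for leaf in np.unique(leaf_ids)}
```

A forest version averages leaf-membership weights over trees before taking quantiles, which is where the mixed-frequency and random-effects extensions plug in.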
- Europe > United Kingdom (0.46)
- Asia (0.45)
- North America > United States > Texas (0.13)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Energy > Oil & Gas > Upstream (1.00)
- Energy > Oil & Gas > Trading (1.00)
- Banking & Finance > Trading (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)
Efficient distributional regression trees learning algorithms for calibrated non-parametric probabilistic forecasts
Duchemin, Quentin, Obozinski, Guillaume
The perspective of developing trustworthy AI for critical applications in science and engineering requires machine learning techniques that are capable of estimating their own uncertainty. In the context of regression, instead of estimating a conditional mean, this can be achieved by producing a predictive interval for the output, or even by learning a model of the conditional probability $p(y|x)$ of an output $y$ given input features $x$. While this can be done under parametric assumptions with, e.g., generalized linear models, these assumptions are typically too strong, and non-parametric models offer flexible alternatives. In particular, for scalar outputs, directly learning a model of the conditional cumulative distribution function of $y$ given $x$ can lead to more precise probabilistic estimates, and the use of proper scoring rules such as the weighted interval score (WIS) and the continuous ranked probability score (CRPS) leads to better coverage and calibration properties. This paper introduces novel algorithms for learning probabilistic regression trees for the WIS or CRPS loss functions. These algorithms are made computationally efficient thanks to an appropriate use of known data structures: min-max heaps, weight-balanced binary trees, and Fenwick trees. Through numerical experiments, we demonstrate that the performance of our methods is competitive with alternative approaches. Additionally, our methods benefit from the inherent interpretability and explainability of trees. As a by-product, we show how our trees can be used in the context of conformal prediction and explain why they are particularly well-suited for achieving group-conditional coverage guarantees.
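For reference, the CRPS of a sample-based predictive distribution has a simple closed form. A naive quadratic-time version is shown below; the paper's contribution is computing such quantities efficiently inside tree learning with the data structures named above, which this sketch does not attempt:

```python
import numpy as np

def crps_sample(sample, y):
    """CRPS of an empirical predictive distribution via the energy form
    CRPS(F, y) = E|X - y| - 0.5 E|X - X'|, with X, X' i.i.d. from F.
    Naive O(m^2) computation, purely for illustration."""
    x = np.asarray(sample, dtype=float)
    term1 = np.abs(x - y).mean()
    term2 = np.abs(x[:, None] - x[None, :]).mean()
    return term1 - 0.5 * term2
```

For a degenerate (point) forecast the second term vanishes and the CRPS reduces to the absolute error, which is why it is a natural probabilistic generalization of that loss.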
- Europe > Switzerland > Zürich > Zürich (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine (0.67)
- Leisure & Entertainment > Games (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Quantifying Uncertainty and Variability in Machine Learning: Confidence Intervals for Quantiles in Performance Metric Distributions
Lehmann, Christoph, Paromau, Yahor
Machine learning models are widely used in applications where reliability and robustness are critical. Model evaluation often relies on single-point estimates of performance metrics such as accuracy, F1 score, or mean squared error, which fail to capture the inherent variability in model performance. This variability arises from multiple sources, including the train-test split, weight initialization, and hyperparameter tuning. Investigating the characteristics of performance-metric distributions, rather than focusing on a single point, is essential for informed decision-making during model selection and optimization, especially in high-stakes settings. How does the performance metric vary due to intrinsic uncertainty in the selected modeling approach, for example when the train-test split changes, initial weights are re-drawn, or hyperparameter tuning uses a stochastic algorithm? This shifts the focus from identifying a single best model to understanding a distribution of the performance metric that captures variability across different training conditions. By running multiple experiments with varied settings, empirical distributions of performance metrics can be generated, and analyzing these distributions can lead to more robust models that generalize well across diverse scenarios. This contribution explores the use of quantiles and confidence intervals to analyze such distributions, providing a more complete understanding of model performance and its uncertainty. Aimed at a statistically interested audience within the machine learning community, the suggested approaches are easy to implement and apply to various performance metrics for classification and regression problems. Given the often long training times in ML, particular attention is given to small sample sizes (on the order of 10-25).
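One standard distribution-free construction of such an interval uses order statistics, since the number of observations below the q-quantile is Binomial(n, q). A sketch of that classical construction (the paper may use a different one):

```python
import numpy as np
from math import comb

def quantile_ci(sample, q, conf=0.9):
    """Distribution-free confidence interval for the q-quantile: for
    sorted data x[0] <= ... <= x[n-1], the coverage of [x[l], x[u]] is
    P(l+1 <= B <= u) with B ~ Binomial(n, q), so the interval is widened
    symmetrically until that probability reaches conf. One standard
    order-statistic construction, shown for illustration."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    pmf = [comb(n, k) * q**k * (1.0 - q)**(n - k) for k in range(n + 1)]
    l = u = int(np.clip(round(q * (n + 1)) - 1, 0, n - 1))
    while sum(pmf[l + 1:u + 1]) < conf and (l > 0 or u < n - 1):
        if l > 0:
            l -= 1
        if u < n - 1:
            u += 1
    return x[l], x[u]
```

Because the coverage statement only uses ranks, the interval is exact for any continuous metric distribution, which matters precisely in the small-sample regime (10-25 runs) the contribution targets.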
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Reviews: The Multiple Quantile Graphical Model
In this paper, titled "The Multiple Quantile Graphical Model", the authors introduce the MQGM, which offers a much broader class of conditional distribution estimates by introducing a set of quantile levels. One key contribution is that the proposed MQGM asymptotically identifies the exact conditional independencies, under some conditions, as the size of the graph grows. There are some empirical results comparing the proposed algorithm with alternatives. Strengths of the paper: it addresses an interesting and important problem, the idea is simple, and the theoretical proofs are sound. My concern is mainly with the significance of the key idea. The proposed method is very natural, and at a high level, introducing a set of quantile levels to provide a more expressive class of estimates is not new. The main novelty here might be using multiple quantiles for non-parametric, continuous data.